AIRWeb 2005 Proceedings

Authors

  • Brian D. Davison
  • Ingmar Weber
  • Panagiotis T. Metaxas
  • Joseph DeStefano
Abstract

We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In contrast to other link spam filtering approaches, our method requires no training, no hard-coded rule sets, and no knowledge of complete-web connectivity. Preliminary experiments with identification of typical blog spam show promising results.
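
The comparison described above can be illustrated with a minimal sketch. The abstract does not say how two language models are compared, so this example assumes additively smoothed unigram models and Kullback-Leibler divergence as the disagreement measure; the function names (`tokenize`, `unigram_model`, `kl_divergence`, `disagreement`) and the smoothing constant `alpha` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_model(text, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary.

    alpha is an assumed smoothing constant, not a value from the abstract.
    """
    counts = Counter(tokenize(text))
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def disagreement(post, comment):
    """Divergence of the comment's language model from the post's model."""
    vocab = set(tokenize(post)) | set(tokenize(comment))
    if not vocab:
        return 0.0
    p = unigram_model(comment, vocab)
    q = unigram_model(post, vocab)
    return kl_divergence(p, q)


post = "I baked sourdough bread this weekend and the crust came out great"
on_topic = "Great crust, which flour did you use for the sourdough bread"
spam = "Cheap pills online casino poker best prices click here"

# A higher score means the comment's wording diverges more from the post.
print(disagreement(post, on_topic))
print(disagreement(post, spam))
```

Under these assumptions, the off-topic comment scores higher than the on-topic one. The same function could be applied to the text of a page linked from the comment; the cutoff above which a comment would be treated as spam is not given in the abstract and would have to be chosen separately.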

Similar resources

Web Spam Taxonomy

Web spamming refers to actions intended to mislead search engines into ranking some pages higher than they deserve. Recently, the amount of web spam has increased dramatically, leading to a degradation of search results. This paper presents a comprehensive taxonomy of current spamming techniques, which we believe can help in developing appropriate countermeasures.

Pagerank Increase under Different Collusion Topologies

We study the impact of collusion (nepotistic linking) in a Web graph in terms of Pagerank. We prove a bound on the Pagerank increase that depends both on the reset probability of the random walk ε and on the original Pagerank of the colluding set. In particular, due to the power law distribution of Pagerank, we show that highly-ranked Web sites do not benefit that much from collusion.

Blocking Blog Spam with Language Model Disagreement

We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In contrast to other link spam filtering approaches, our method requires no training, no hard-coded rule sets, and no knowledge of complete-web connectivity. Preliminary experiments with identification of typical blog spam ...

SpamRank -- Fully Automatic Link Spam Detection

Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed pages have a biased distribution of p...

Optimal Link Bombs are Uncoordinated

We analyze the recent phenomenon termed a Link Bomb, and investigate the optimal attack pattern for a group of web pages attempting to link bomb a specific web page. The typical modus operandi of a link bomb is to associate a particular page with a search text and then boost that page’s pagerank. (The attacking pages can only control their own content and outgoing links.) Thus, when a search is...

Publication date: 2005